An Adaptive Partitional Clustering Method for Categorical Attribute Using K-medoid

Author

  • A. Selvakumar
Abstract

Partitioning a large set of objects into homogeneous clusters is a fundamental operation in data mining. The operation is needed in a number of data mining tasks, such as unsupervised classification and data summation, as well as the segmentation of large heterogeneous data sets into smaller homogeneous subsets that can be easily managed, separately modeled, and analyzed. Clustering is a popular approach to implementing this operation. Partitional clustering attempts to decompose the data set directly into a set of disjoint clusters. More specifically, partitional algorithms attempt to determine an integer number of partitions that optimize a certain criterion function. The criterion function may emphasize the local or global structure of the data, and its optimization is an iterative procedure. The intention of this work is to analyze the observation that partitional clustering algorithms perform more efficiently on numerical attributes than on categorical attributes, and to determine which algorithm is best suited to matrix data. These algorithms work with larger datasets that have many attributes. For the analysis, the Iris dataset was retrieved from the UCI data repository and used with K-Medoid. The outcome of the algorithm is a partition of the data into clusters, which can also be visualized graphically. The cluster figures distinguish the clusters by color and mark each cluster's centroid distinctly. Finally, it was determined that K-Medoid is the better partitional algorithm.
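To make the iterative optimization of a criterion function concrete, the following is a minimal sketch of k-medoids in Python. It is not the paper's implementation: the abstract does not say which K-Medoid variant was used, so a PAM-style swap search is assumed here, and the function names (`k_medoids`, `total_cost`, `simple_matching`) and the toy records are hypothetical stand-ins rather than the Iris data. The distance function is passed in as a parameter, so the same procedure runs on numerical attributes (Euclidean distance) and on categorical attributes (a simple mismatch count).

```python
import random


def euclidean(a, b):
    """Distance for numerical attributes."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def simple_matching(a, b):
    """Dissimilarity for categorical attributes: number of mismatching fields."""
    return sum(x != y for x, y in zip(a, b))


def total_cost(points, medoids, dist):
    """Criterion function: sum of each object's distance to its nearest medoid."""
    return sum(min(dist(p, points[m]) for m in medoids) for p in points)


def k_medoids(points, k, dist, seed=0):
    """PAM-style swap search (an assumed variant).

    Returns (medoid indices, cluster label per object).
    """
    rng = random.Random(seed)
    medoids = rng.sample(range(len(points)), k)  # random initial medoids
    best = total_cost(points, medoids, dist)
    improved = True
    while improved:  # iterate until no single swap lowers the cost
        improved = False
        for i in range(k):
            for c in range(len(points)):
                if c in medoids:
                    continue
                # Try replacing medoid i with candidate object c.
                trial = medoids[:i] + [c] + medoids[i + 1:]
                cost = total_cost(points, trial, dist)
                if cost < best - 1e-12:
                    medoids, best, improved = trial, cost, True
    labels = [min(range(k), key=lambda j: dist(p, points[medoids[j]]))
              for p in points]
    return medoids, labels


# Hypothetical toy data (not Iris): two well-separated numerical groups.
numeric = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
           (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]
# Hypothetical categorical records with three attributes each.
categorical = [("red", "small", "round"), ("red", "small", "oval"),
               ("red", "medium", "round"), ("blue", "large", "square"),
               ("blue", "large", "oval"), ("green", "large", "square")]

num_medoids, num_labels = k_medoids(numeric, 2, euclidean)
cat_medoids, cat_labels = k_medoids(categorical, 2, simple_matching)
print("numerical medoids:", sorted(num_medoids), "labels:", num_labels)
print("categorical medoids:", sorted(cat_medoids), "labels:", cat_labels)
```

Because a medoid is always an actual object from the data set, only pairwise dissimilarities are needed. That is why the same search applies to categorical records, where a mean-based centroid is not defined.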


Similar resources

Genetic Algorithms in Partitional Clustering: A Comparison

Three approaches to partitional clustering using genetic algorithms (GA) are compared with k-means and the EM algorithm for three real world datasets (Iris, Glass and Vowel). The GA techniques differ in their encoding of the clustering problem using either a class id for each object (GAIE), medoids to assign objects to the class associated with the nearest medoid (GAME), or parameters for multi...


Computation of Initial Modes for K-modes Clustering Algorithm Using Evidence Accumulation

Clustering accuracy of partitional clustering algorithm for categorical data depends primarily on the choice of initial data points to instigate the clustering process and hence the clustering results cannot be generated and repeated consistently. In this paper we present an approach to compute initial modes for K-mode partitional clustering algorithm to cluster categorical data sets. Here we u...


A cluster centers initialization method for clustering categorical data

Keywords: the k-modes algorithm; initialization method; initial cluster centers; density; distance. Abstract: The leading partitional clustering technique, k-modes, is one of the most computationally efficient clustering methods for categorical data. However, the performance of the k-modes clustering algorithm which converges to numerous local minima strongly depends on initial cluster centers...


Differential evolution and particle swarm optimisation in partitional clustering

In recent years, many partitional clustering algorithms based on genetic algorithms (GA) have been proposed to tackle the problem of finding the optimal partition of a data set. Surprisingly, very few studies considered alternative stochastic search heuristics other than GAs or simulated annealing. Two promising algorithms for numerical optimization, which are hardly known outside the heuristic...


Context-Based Distance Learning for Categorical Data Clustering

Clustering data described by categorical attributes is a challenging task in data mining applications. Unlike numerical attributes, it is difficult to define a distance between pairs of values of the same categorical attribute, since they are not ordered. In this paper, we propose a method to learn a context-based distance for categorical attributes. The key intuition of this work is that the d...




Publication date: 2013